- Common Lisp the Language, 2nd Edition
- -------------------------------------------------------------------------------
-
- 18. Strings
-
- A string is a specialized vector (one-dimensional array) whose elements are
- characters.
-
- [old_change_begin]
- Specifically, the type string is identical to the type (vector string-char),
- which in turn is the same as (array string-char (*)).
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
- string-char and to redefine the type string to be the union of one or more
- specialized vector types, the types of whose elements are subtypes of the type
- character.
- [change_end]
-
- Any string-specific function defined in this chapter whose name begins with the
- prefix string will accept a symbol instead of a string as an argument provided
- that the operation never modifies that argument; the print name of the symbol
- is used. In this respect the string-specific sequence operations are not
- simply specializations of generic versions; the generic sequence operations
- described in chapter 14 never accept symbols as sequences. This slight
- inelegance is permitted in Common Lisp in the name of pragmatic utility. One
- may get the effect of having a generic sequence function operate on either
- symbols or strings by applying the coercion function string to any argument
- whose data type is in doubt.
-
- [change_begin]
- Note that this remark, predating the design of the Common Lisp Object System,
- uses the term ``generic'' in a generic sense and not necessarily in the
- technical sense used by CLOS (see chapter 2).
- [change_end]
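
A minimal sketch of the symbol-accepting behavior described above. It assumes the standard reader, which converts symbol names to uppercase, and name-length is a hypothetical helper introduced only for illustration.

```lisp
;; Under the standard reader the print name of the symbol FOO is "FOO".
(string= 'foo "FOO")            ; true: the print name "FOO" is compared
(string-equal 'foo "foo")       ; true: case is ignored

;; Sequence functions such as LENGTH and REVERSE do not accept symbols,
;; so the argument is coerced with STRING first.
(length (string 'foo))          ; => 3
(reverse (string 'foo))         ; => "OOF"

;; NAME-LENGTH is a hypothetical helper, not part of Common Lisp.
;; It accepts either a symbol or a string.
(defun name-length (x)
  (length (string x)))

(name-length 'foo)              ; => 3
(name-length "quux")            ; => 4
```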
-
- Also, there is a slight non-parallelism in the names of string functions. Where
- the suffixes equalp and eql would be more appropriate, for historical
- compatibility the suffixes equal and = are used instead to indicate
- case-insensitive and case-sensitive character comparison, respectively.
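
The naming mismatch can be seen by placing the string predicates next to the general equality predicates; the following is only an illustrative comparison.

```lisp
(string= "foo" "Foo")        ; false: case-sensitive comparison
(string-equal "foo" "Foo")   ; true: case-insensitive comparison

;; The general-purpose predicates behave correspondingly on strings:
(equal "foo" "Foo")          ; false, like string=
(equalp "foo" "Foo")         ; true, like string-equal
```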
-
- Any Lisp object may be tested for being a string by the predicate stringp.
-
- Note that strings, like all vectors, may have fill pointers (though such
- strings are not necessarily simple). String operations generally operate only
- on the active portion of the string (below the fill pointer). See fill-pointer
- and related functions.
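
As a sketch of how a fill pointer limits the active portion, the following builds a string with room for eight characters of which two are active. The variable name *buf* is an assumption made for this example.

```lisp
;; Eight characters are allocated, but only the first two are active.
(defparameter *buf*
  (make-array 8 :element-type 'character
                :initial-element #\-
                :fill-pointer 2))

(length *buf*)                 ; => 2, the active length
(array-dimension *buf* 0)      ; => 8, the allocated size
(string= *buf* "--")           ; true: only the active portion is compared

(vector-push #\x *buf*)        ; stores #\x at index 2 and advances the fill pointer
(string-upcase *buf*)          ; => "--X"
```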
-
- -------------------------------------------------------------------------------
-
- * String Access
- * String Comparison
- * String Construction and Manipulation
-
- -------------------------------------------------------------------------------
-
- 18.1. String Access
-
- The following functions access a single character element of a string.
-
- [Function]
- char string index
- schar simple-string index
-
- The given index must be a non-negative integer less than the length of string,
- which must be a string. The character at position index of the string is
- returned as a character object. (This character will necessarily satisfy the
- predicate string-char-p.)
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate string-char-p.
- [change_end]
-
- As with all sequences in Common Lisp, indexing is zero-origin. For example:
-
- (char "Floob-Boober-Bab-Boober-Bubs" 0) => #\F
- (char "Floob-Boober-Bab-Boober-Bubs" 1) => #\l
-
- See aref and elt. In effect,
-
- (char s j) == (aref (the string s) j)
-
- setf may be used with char to destructively replace a character within a
- string.
-
- For char, the string may be any string; for schar, it must be a simple string.
- In some implementations of Common Lisp, the function schar may be faster than
- char when it is applicable.
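
A short sketch of reading and replacing characters. The variable *s* is hypothetical; copy-seq is used so that a fresh simple string, rather than a literal constant, is the object modified.

```lisp
;; COPY-SEQ returns a fresh simple string, so both CHAR and SCHAR apply.
(defparameter *s* (copy-seq "Floob"))

(char *s* 0)                   ; => #\F
(schar *s* 4)                  ; => #\b
(setf (char *s* 0) #\G)        ; destructively replaces the first character
*s*                            ; => "Gloob"
```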
-
- -------------------------------------------------------------------------------
-
- 18.2. String Comparison
-
- The naming conventions for these functions and for their keyword arguments
- generally follow the conventions for the generic sequence functions (see
- chapter 14).
-
- [change_begin]
- Note that this remark, predating the design of the Common Lisp Object System,
- uses the term ``generic'' in a generic sense and not necessarily in the
- technical sense used by CLOS (see chapter 2).
- [change_end]
-
- [Function]
- string= string1 string2 &key :start1 :end1 :start2 :end2
-
- string= compares two strings and is true if they are the same (corresponding
- characters are identical) but is false if they are not. The function equal
- calls string= if applied to two strings.
-
- The keyword arguments :start1 and :start2 are the places in the strings to
- start the comparison. The arguments :end1 and :end2 are the places in the
- strings to stop comparing; comparison stops just before the position specified
- by a limit. The ``start'' arguments default to zero (beginning of string), and
- the ``end'' arguments (if either omitted or nil) default to the lengths of the
- strings (end of string), so that by default the entirety of each string is
- examined. These arguments are provided so that substrings can be compared
- efficiently.
-
- string= is necessarily false if the (sub)strings being compared are of unequal
- length; that is, if
-
- (not (= (- end1 start1) (- end2 start2)))
-
- is true, then string= is false.
-
- (string= "foo" "foo") is true
- (string= "foo" "Foo") is false
- (string= "foo" "bar") is false
- (string= "together" "frog" :start1 1 :end1 3 :start2 2)
- is true
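
One common use of the substring keywords is a prefix test. The function prefix-p below is a hypothetical helper, sketched on the assumption that both arguments are strings.

```lisp
(defun prefix-p (prefix string)
  "True if STRING begins with PREFIX (a hypothetical helper)."
  (let ((n (length prefix)))
    ;; Guard first: :end2 must not exceed the length of STRING.
    (and (<= n (length string))
         (string= prefix string :end2 n))))

(prefix-p "fro" "frog")      ; true
(prefix-p "frog" "fro")      ; false: the prefix is longer than the string
(prefix-p "Fro" "frog")      ; false: string= is case-sensitive
```

Restricting the comparison with :end2 avoids allocating a substring with subseq.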
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to clarify string coercion (see
- string).
- [change_end]
-
- -------------------------------------------------------------------------------
- Compatibility note: string= is called strequal in Interlisp.
- -------------------------------------------------------------------------------
-
- [Function]
- string-equal string1 string2 &key :start1 :end1 :start2 :end2
-
- string-equal is just like string= except that differences in case are ignored;
- two characters are considered to be the same if char-equal is true of them. For
- example:
-
- (string-equal "foo" "Foo") is true
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to clarify string coercion (see
- string).
- [change_end]
-
- [Function]
-
- string< string1 string2 &key :start1 :end1 :start2 :end2
- string> string1 string2 &key :start1 :end1 :start2 :end2
- string<= string1 string2 &key :start1 :end1 :start2 :end2
- string>= string1 string2 &key :start1 :end1 :start2 :end2
- string/= string1 string2 &key :start1 :end1 :start2 :end2
-
- These functions compare the two string arguments lexicographically, and the
- result is nil unless string1 is respectively less than, greater than, less than
- or equal to, greater than or equal to, or not equal to string2. If the
- condition is satisfied, however, then the result is the index within the
- strings of the first character position at which the strings fail to match; put
- another way, the result is the length of the longest common prefix of the
- strings.
-
- A string a is less than a string b if in the first position in which they
- differ the character of a is less than the corresponding character of b
- according to the function char<, or if string a is a proper prefix of string b
- (of shorter length and matching in all the characters of a).
-
- The keyword arguments :start1 and :start2 are the places in the strings to
- start the comparison. The keyword arguments :end1 and :end2 are the places in
- the strings to stop comparing; comparison stops just before the position
- specified by a limit. The ``start'' arguments default to zero (beginning of
- string), and the ``end'' arguments (if either omitted or nil) default to the
- lengths of the strings (end of string), so that by default the entirety of each
- string is examined. These arguments are provided so that substrings can be
- compared efficiently. The index returned in case of a mismatch is an index into
- string1.
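
The returned index can be illustrated with a few calls. These are sketches that use only lowercase letters, whose relative order under char< is guaranteed.

```lisp
(string< "abc" "abd")                 ; => 2
(string< "abd" "abc")                 ; => NIL
(string< "ab" "abc")                  ; => 2, a proper prefix is less
(string<= "abc" "abc")                ; => 3, the full common length
(string/= "abc" "abc")                ; => NIL
(string> "abd" "abc")                 ; => 2

;; With :start1 the result is still an index into string1.
(string< "xxabc" "abd" :start1 2)     ; => 4
```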
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to clarify string coercion (see
- string).
- [change_end]
-
- [Function]
-
- string-lessp string1 string2 &key :start1 :end1 :start2 :end2
- string-greaterp string1 string2 &key :start1 :end1 :start2 :end2
- string-not-greaterp string1 string2 &key :start1 :end1 :start2 :end2
- string-not-lessp string1 string2 &key :start1 :end1 :start2 :end2
- string-not-equal string1 string2 &key :start1 :end1 :start2 :end2
-
- These are exactly like string<, string>, string<=, string>=, and string/=,
- respectively, except that distinctions between uppercase and lowercase letters
- are ignored. It is as if char-lessp were used instead of char< for comparing
- characters.
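
A brief sketch of the case-insensitive comparisons, including their use as a sorting predicate. The list of names is made up for the example.

```lisp
(string-lessp "apple" "BANANA")          ; => 0
(string-not-greaterp "abc" "ABC")        ; => 3
(string-not-equal "Hello" "HELLO")       ; => NIL
(string-greaterp "abD" "ABc")            ; => 2

;; Sorting a freshly consed list of names without regard to case.
(sort (list "pear" "Apple" "banana") #'string-lessp)
;; => ("Apple" "banana" "pear")
```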
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to clarify string coercion (see
- string).
- [change_end]
-
- -------------------------------------------------------------------------------
-
- 18.3. String Construction and Manipulation
-
- Most of the interesting operations on strings may be performed with the generic
- sequence functions described in chapter 14. The following functions perform
- additional operations that are specific to strings.
-
- [change_begin]
- Note that this remark, predating the design of the Common Lisp Object System,
- uses the term ``generic'' in a generic sense and not necessarily in the
- technical sense used by CLOS (see chapter 2).
- [change_end]
-
- [old_change_begin]
-
- [Function]
- make-string size &key :initial-element
-
- This returns a string (in fact a simple string) of length size, each of whose
- characters has been initialized to the :initial-element argument. If an
- :initial-element argument is not specified, then the string will be initialized
- in an implementation-dependent way.
-
- -------------------------------------------------------------------------------
- Implementation note: It may be convenient to initialize the string to null
- characters, or to spaces, or to garbage (``whatever was there'').
- -------------------------------------------------------------------------------
-
- A string is really just a one-dimensional array of ``string characters'' (that
- is, those characters that are members of type string-char). More complex
- character arrays may be constructed using the function make-array.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
- string-char and to add a keyword argument :element-type to make-string. The new
- function description is as follows.
-
- [Function]
- make-string size &key :initial-element :element-type
-
- This returns a simple string of length size, each of whose characters has been
- initialized to the :initial-element argument. If an :initial-element argument
- is not specified, then the string will be initialized in an
- implementation-dependent way.
-
- The :element-type argument names the type of the elements of the string; a
- string is constructed of the most specialized type that can accommodate
- elements of the given type. If :element-type is omitted, the type character is
- the default.
-
- X3J13 voted in January 1989 (ARGUMENTS-UNDERSPECIFIED) to clarify that the
- size argument must be a non-negative integer less than the value of
- array-dimension-limit.
- [change_end]
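
A minimal sketch of make-string with and without :element-type. The variable *row* is hypothetical, and the representation chosen for a standard-char request depends on the implementation.

```lisp
(make-string 3 :initial-element #\x)      ; => "xxx"
(make-string 0)                           ; => "", a size of zero is allowed

;; Requesting a narrower element type; the implementation picks the most
;; specialized string representation that can hold such characters.
(defparameter *row*
  (make-string 4 :initial-element #\. :element-type 'standard-char))

(simple-string-p *row*)                   ; true
(length *row*)                            ; => 4
*row*                                     ; => "...."
```

Supplying :initial-element is advisable whenever the contents will be read before being written, since the default contents are implementation-dependent.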
-
- [Function]
- string-trim character-bag string
- string-left-trim character-bag string
- string-right-trim character-bag string
-
- string-trim returns a substring of string, with all characters in character-bag
- stripped off the beginning and end. The function string-left-trim is similar
- but strips characters off only the beginning; string-right-trim strips off only
- the end. The argument character-bag may be any sequence containing characters.
- For example:
-
- (string-trim '(#\Space #\Tab #\Newline) " garbanzo beans
- ") => "garbanzo beans"
- (string-trim " (*)" " ( *three (silly) words* ) ")
- => "three (silly) words"
- (string-left-trim " (*)" " ( *three (silly) words* ) ")
- => "three (silly) words* ) "
- (string-right-trim " (*)" " ( *three (silly) words* ) ")
- => " ( *three (silly) words"
-
- If no characters need to be trimmed from the string, then either the argument
- string itself or a copy of it may be returned, at the discretion of the
- implementation.
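
Because the result may be the argument itself, code that intends to modify the trimmed string destructively may want a guaranteed fresh copy. The function trimmed-copy is a hypothetical helper sketched for that purpose.

```lisp
(defun trimmed-copy (character-bag string)
  "Like STRING-TRIM, but always returns a fresh string (hypothetical helper)."
  (copy-seq (string-trim character-bag string)))

(trimmed-copy " " "  padded  ")     ; => "padded"
(trimmed-copy " " "bare")           ; => "bare", never the argument itself
```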
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to clarify string coercion (see
- string).
- [change_end]
-
- [Function]
- string-upcase string &key :start :end
- string-downcase string &key :start :end
- string-capitalize string &key :start :end
-
- string-upcase returns a string just like string with all lowercase characters
- replaced by the corresponding uppercase characters. More precisely, each
- character of the result string is produced by applying the function char-upcase
- to the corresponding character of string.
-
- string-downcase is similar, except that uppercase characters are converted to
- lowercase characters (using char-downcase).
-
- The keyword arguments :start and :end delimit the portion of the string to be
- affected. The result is always of the same length as string, however.
-
- The argument is not destroyed. However, if no characters in the argument
- require conversion, the result may be either the argument or a copy of it, at
- the implementation's discretion. For example:
-
- (string-upcase "Dr. Livingstone, I presume?")
- => "DR. LIVINGSTONE, I PRESUME?"
- (string-downcase "Dr. Livingstone, I presume?")
- => "dr. livingstone, i presume?"
- (string-upcase "Dr. Livingstone, I presume?" :start 6 :end 10)
- => "Dr. LiVINGstone, I presume?"
-
- string-capitalize produces a copy of string such that, for every word in the
- copy, the first character of the word, if case-modifiable, is uppercase and any
- other case-modifiable characters in the word are lowercase. For the purposes of
- string-capitalize, a word is defined to be a consecutive subsequence consisting
- of alphanumeric characters (letters or digits), delimited at each end either by a
- non-alphanumeric character or by an end of the string. For example:
-
- (string-capitalize " hello ") => " Hello "
- (string-capitalize
- "occlUDeD cASEmenTs FOreSTAll iNADVertent DEFenestraTION")
- => "Occluded Casements Forestall Inadvertent Defenestration"
- (string-capitalize 'kludgy-hash-search) => "Kludgy-Hash-Search"
- (string-capitalize "DON'T!") => "Don'T!" ;not "Don't!"
- (string-capitalize "pipe 13a, foo16c") => "Pipe 13a, Foo16c"
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to clarify string coercion (see
- string).
- [change_end]
-
- -------------------------------------------------------------------------------
- Compatibility note: Some very approximate Interlisp equivalents to
- string-upcase, string-downcase, and string-capitalize are u-case, l-case with
- second argument nil, and l-case with second argument t.
- -------------------------------------------------------------------------------
-
- [Function]
- nstring-upcase string &key :start :end
- nstring-downcase string &key :start :end
- nstring-capitalize string &key :start :end
-
- These three functions are just like string-upcase, string-downcase, and
- string-capitalize but destructively modify the argument string by altering
- case-modifiable characters as necessary.
-
- The keyword arguments :start and :end delimit the portion of the string to be
- affected. The argument string is returned as the result.
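
A sketch of the destructive variants. The variable *line* is hypothetical and holds a fresh copy so that no literal constant is modified.

```lisp
(defparameter *line* (copy-seq "hello world"))

(eq (nstring-upcase *line* :end 5) *line*)   ; true: the argument itself is returned
*line*                                       ; => "HELLO world"

(nstring-capitalize *line*)                  ; => "Hello World"
*line*                                       ; => "Hello World"
```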
-
- [Function]
- string x
-
- Most of the string functions effectively apply string to such of their
- arguments as are supposed to be strings. If x is a string, it is returned. If x
- is a symbol, its print name is returned.
-
- [old_change_begin]
- If x is a string character (a character of type string-char), then a string
- containing that one character is returned.
- [old_change_end]
-
- [change_begin]
- X3J13 voted in March 1989 (CHARACTER-PROPOSAL) to eliminate the type
- string-char and to redefine the type string to be the union of one or more
- specialized vector types, the types of whose elements are subtypes of the type
- character. Presumably converting a character to a string always works according
- to this vote.
- [change_end]
-
- In any other situation, an error is signaled.
-
- To convert a sequence of characters to a string, use coerce. (Note that (coerce
- x 'string) will not succeed if x is a symbol. Conversely, string will not
- convert a list or other sequence into a string.)
-
- To get the string representation of a number or any other Lisp object, use
- prin1-to-string, princ-to-string, or format.
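
The different conversions can be compared side by side. This sketch assumes the standard reader (which upcases symbol names) and the default printer settings.

```lisp
(string "abc")                      ; => "abc"
(string 'abc)                       ; => "ABC"
(string #\a)                        ; => "a"

;; Sequences of characters go through COERCE, not STRING.
(coerce '(#\a #\b #\c) 'string)     ; => "abc"
(coerce #(#\a #\b) 'string)         ; => "ab"

;; Other objects go through the printer.
(princ-to-string 42)                ; => "42"
(prin1-to-string "hi")              ; => "\"hi\"", escape characters included
(format nil "~D items" 3)           ; => "3 items"
```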
-
- [change_begin]
- X3J13 voted in June 1989 (STRING-COERCION) to specify that the following
- functions perform coercion on their string arguments identical to that
- performed by the function string.
-
- string=     string-equal         string-trim
- string<     string-lessp         string-left-trim
- string>     string-greaterp      string-right-trim
- string<=    string-not-greaterp  string-upcase
- string>=    string-not-lessp     string-downcase
- string/=    string-not-equal     string-capitalize
-
- Note that nstring-upcase, nstring-downcase, and nstring-capitalize are absent
- from this list; because they modify destructively, the argument must be a
- string.
-
- As part of the same vote X3J13 specified that string may perform additional
- implementation-dependent coercions but the returned value must be of type
- string. Only when no coercion is defined, whether standard or
- implementation-dependent, is string required to signal an error, in which case
- the error condition must be of type type-error.
- [change_end]
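
A sketch of guarding against a failed coercion. The function try-string is a hypothetical helper; whether an integer argument signals an error depends on whether the implementation defines an additional coercion for it.

```lisp
(defun try-string (x)
  "Return (STRING X), or :TYPE-ERROR if no coercion applies (hypothetical helper)."
  (handler-case (string x)
    (type-error () :type-error)))

(try-string 'abc)    ; => "ABC" under the standard reader
(try-string #\a)     ; => "a"
(try-string 42)      ; :TYPE-ERROR, unless the implementation coerces integers
```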
-
- -------------------------------------------------------------------------------
-
-
-
-